A powerful approach to estimating annotation-stratified genetic covariance using GWAS summary statistics
Abstract
Despite the success of large-scale genome-wide association studies (GWASs) on complex traits, our understanding of their genetic architecture is far from complete. Jointly modeling multiple traits’ genetic profiles has provided insights into the shared genetic basis of many complex traits. However, large-scale inference sets a high bar for both statistical power and interpretability. Here we introduce a principled framework to estimate annotation-stratified genetic covariance between traits using GWAS summary statistics. Through theoretical and numerical analyses we demonstrate that our method provides accurate covariance estimates, thus enabling researchers to dissect both the shared and distinct genetic architecture across traits to better understand their etiologies. Among 50 complex traits with publicly accessible GWAS summary statistics (Ntotal ≈ 4.5 million), we identified more than 170 pairs with statistically significant genetic covariance. In particular, we found strong genetic covariance between late-onset Alzheimer’s disease (LOAD) and amyotrophic lateral sclerosis (ALS), two major neurodegenerative diseases, in SNPs with high minor allele frequencies and SNPs in the predicted functional genome. Joint analysis of LOAD, ALS, and other traits highlights LOAD’s correlation with cognitive traits and hints at an autoimmune component for ALS.
bioRxiv preprint first posted online Mar. 7, 2017; doi: http://dx.doi.org/10.1101/114561. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
Similar resources
A Powerful Approach to Estimating Annotation-Stratified Genetic Covariance via GWAS Summary Statistics.
Despite the success of large-scale genome-wide association studies (GWASs) on complex traits, our understanding of their genetic architecture is far from complete. Jointly modeling multiple traits' genetic profiles has provided insights into the shared genetic basis of many complex traits. However, large-scale inference sets a high bar for both statistical power and biological interpretability....
Estimating Effect Sizes and Expected Replication Probabilities from GWAS Summary Statistics
Genome-wide Association Studies (GWAS) result in millions of summary statistics ("z-scores") for single nucleotide polymorphism (SNP) associations with phenotypes. These rich datasets afford deep insights into the nature and extent of genetic contributions to complex phenotypes such as psychiatric disorders, which are understood to have substantial genetic components that arise from very large ...
GWIS: Genome-Wide Inferred Statistics for Functions of Multiple Phenotypes.
Here we present a method of genome-wide inferred study (GWIS) that provides an approximation of genome-wide association study (GWAS) summary statistics for a variable that is a function of phenotypes for which GWAS summary statistics, phenotypic means, and covariances are available. A GWIS can be performed regardless of sample overlap between the GWAS of the phenotypes on which the function dep...
JEPEGMIX: gene-level joint analysis of functional SNPs in cosmopolitan cohorts
Motivation: To increase detection power, gene level analysis methods are used to aggregate weak signals. To greatly increase computational efficiency, most methods use as input summary statistics from genome-wide association studies (GWAS). Subsequently, gene statistics are constructed using linkage disequilibrium (LD) patterns from a relevant reference panel. However, all methods, including our...
Using Linear Predictors to Impute Allele Frequencies from Summary or Pooled Genotype Data.
Recently-developed genotype imputation methods are a powerful tool for detecting untyped genetic variants that affect disease susceptibility in genetic association studies. However, existing imputation methods require individual-level genotype data, whereas in practice it is often the case that only summary data are available. For example this may occur because, for reasons of privacy or politi...